NeuralSeek Setup
Set Up
- Once you've opened NeuralSeek, you'll land on the "Configure" tab.
- Enter the name of the company or organization that NeuralSeek will be generating answers for. Click "Next"
KnowledgeBase Connection
- Input Watson Discovery KnowledgeBase details. Discovery API Key and Discovery Service URL can be found on the Watson Discovery service instance page. Once you have created a project and a collection in Watson Discovery, the Discovery Project ID can be found on the integration tab under API Information. Once you fill out the KnowledgeBase Connection details, test the connection by clicking "Test". Once tested, click "Next". The button will turn green when it successfully connects to Watson Discovery.
- For Virtual Agent Type, select "Watson Assistant Type". Click "Next". Click "Next".
LLM Details
In the LLM Details, you must add at least one LLM. If you choose to add multiple, NeuralSeek will load-balance across them for the selected functions that have multiple LLM's. For API key, there are two options: SaaS and CP4D. You only need to fill out one or the other. Ensure your endpoint and project id are accurate.
In this case, we are interested in CP4D. The CP4D api key must be a base64 encoded key, and it can be found using the terminal comamnd below. You can generate 'myapikey' by clicking on the profile picture on the top right hand corner and then clicking 'Generate new key'.
printf "myusername:myapikey" | base64
You can confirm if you succeeded in connecting with your zenapikey by using the curl command below:
curl -H "Authorization: ZenApiKey ${TOKEN}" "https://<cpd_instance_route>/<endpoint>"
Credentials from watsonx.ai that you need for LLM details: First make sure you have a project created within watsonx.ai.
- LLM Endpoint
- Go to WatsonX Platform
- Prompt Lab -> View code (Right hand side, to the right of Model) -> Copy the API endpoint after
curl
- LLM Project ID:
- Go to WatsonX Platform
- Projects -> Manage -> General -> Details -> Project ID
Configuration & Tuning
Please see below for recommended settings:
-
Knowledgebase Connection:
- Curation Data Field: text
- Link Field: url
- Document Name Field: extracted_metadata.title
- Attribute sources inside LLM Conext by Document Name: Enabled
- Return the full document instead of passages (only enable this if all of your documents are short): Enabled
-
Knowledge Base Tuning:
- Document Score Range: 0.8
- Max Documents Per Seek: 1
- Document Data Penalty: 0
- Snippet Size: 1000
- KnowledgeBase Query Cache (minutes): 0
-
Prompt Engineering
- Weight Tuning Section
- Temperature: 0
- Top Probability: 0.7
- Frequency Penalty: -0.3
- Maximum Tokens: 0
- Weight Tuning Section
-
Answer Engineering & Preferences
- For "How verbose should an answer be", select second tier using the slider.
-
Intent Matching & Cache Configuration
- Edited answer cache: 3
- Normal answer cache: 5
- Required Cache to Follow Context? Yes
- Required Cache to match the exact KB for the question and not the intent? No
-
Governance & Guardrails
- For the Semantic Score section, please turn on the following using the toggle:
- Enable the Semantic Score Model
- Use Semantic Score as the basis for Warning & Minimum confidence. Do NOT Enable for usecases requiring language translation.
- Rerank the search results based on the Semantic Match
- Semantic Tuning
- Missing key search term penalty: 0.4
- Missing search term penalty: 0.25
- Source jump penalty: 6
- Total coverage weight: 0.25
- Rerank min coverage %: 0.25
- Warning confidence
- Confidence % for warning: 5
- Minimum Confidence
- Minimum Confidence %: 0
- Minimum Confidence % to display a URL: 5
- Minimum Text: 0
- Maximum Length: 100
- For the Semantic Score section, please turn on the following using the toggle:
Testing
- Navigate to "Seek" tab. Test NeuralSeek with questions that are relevant to your documents, e.g. "What products or services do you offer?" You will be able to see the NeuralSeek answer with response details, metrics, and source. Ensure to clear session turn if starting a new session by clicking on the red reset icon:
In addition to testing on NeuralSeek, we have written a script to allow testing through API for more flexibility. We performed Pre-Processing and No OCR, No Pre-Processing and No OCR, and OCR experiments using the testing notebook. You can and run the different experiments just by changing the Discovery collection ID and providing with the questions and expected responses as string arrays. It uses the NeuralSeek API.
UI Testing
Within the NeuralSeek UI, there is also a feature to send a batch of questions to test at once. This can be helpful for users who prefer UI tools, and returns answers in a spreadhseet format. To use this feature, navigate to the "upload test questions" section on the home page:
When submitting your batch of questions, format them using the template provided before uploading them.
Download Logs
- Proceed to API on Integrate tab
- Click Authorize, then paste in API key and then click authorize.
- Scroll to Logs section, click 'Try it out' and then click execute
- After execute, you can download logs in json format by clicking 'Download'